Generalization Performance of Empirical Risk Minimization on Over-parameterized Deep ReLU Nets
Lin, Shao-Bo, Wang, Yao, Zhou, Ding-Xuan
In this paper, we study the generalization performance of global minima for implementing empirical risk minimization (ERM) on over-parameterized deep ReLU nets. Using a novel deepening scheme for deep ReLU nets, we rigorously prove that there exist perfect global minima achieving almost optimal generalization error bounds for numerous types of data under mild conditions. Since over-parameterization is crucial to guarantee that the global minima of ERM on deep ReLU nets can be realized by the widely used stochastic gradient descent (SGD) algorithm, our results indeed fill a gap between optimization and generalization.
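The over-parameterized ERM setting described in the abstract above can be illustrated with a toy experiment. This is a hedged sketch on synthetic data, not the paper's deepening scheme: when the hidden width far exceeds the sample size, plain full-batch gradient descent on the squared empirical risk drives the training error essentially to zero, i.e., it finds an interpolating global minimum.

```python
import numpy as np

# Minimal illustrative sketch (not the paper's construction): empirical risk
# minimization by full-batch gradient descent on a small over-parameterized
# two-layer ReLU net. The width (64) far exceeds the sample size (8), so a
# global minimum of the empirical risk can interpolate the training data.
rng = np.random.default_rng(0)
n, d, width = 8, 2, 64
X = rng.normal(size=(n, d))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]            # synthetic regression target

W1 = rng.normal(size=(d, width)) / np.sqrt(d)  # hidden-layer weights
b1 = np.zeros(width)
w2 = rng.normal(size=width) / np.sqrt(width)   # output-layer weights

def forward(X):
    H = np.maximum(X @ W1 + b1, 0.0)           # ReLU activations
    return H, H @ w2

lr = 0.05
for _ in range(3000):
    H, pred = forward(X)
    err = pred - y                             # residual of the squared loss
    grad_w2 = H.T @ err / n
    grad_H = np.outer(err, w2) * (H > 0)       # backprop through the ReLU
    W1 -= lr * (X.T @ grad_H / n)
    b1 -= lr * grad_H.mean(axis=0)
    w2 -= lr * grad_w2

print("final empirical risk:", np.mean((forward(X)[1] - y) ** 2))
```

The paper's point is that among the many such interpolating minima, some provably generalize almost optimally; this sketch only shows that over-parameterization makes reaching a global minimum of ERM easy for gradient methods.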
Depth Selection for Deep ReLU Nets in Feature Extraction and Generalization
Han, Zhi, Yu, Siquan, Lin, Shao-Bo, Zhou, Ding-Xuan
Deep learning is recognized to be capable of discovering deep features for representation learning and pattern recognition without requiring elaborate feature-engineering techniques that rely on human ingenuity and prior knowledge. It has therefore triggered enormous research activity in machine learning and pattern recognition. One of the most important challenges of deep learning is to figure out the relation between a feature and the depth of deep neural networks (deep nets for short), so as to reflect the necessity of depth. Our purpose is to quantify this feature-depth correspondence in feature extraction and generalization. We present the adaptivity of features to depths, and vice versa, by exhibiting a depth-parameter trade-off in extracting both single features and composite features. Based on these results, we prove that implementing classical empirical risk minimization on deep nets achieves optimal generalization performance for numerous learning tasks. Our theoretical results are verified by a series of numerical experiments, including toy simulations and a real application to earthquake seismic intensity prediction.
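The depth-parameter trade-off and the necessity of depth discussed in the abstract above can be illustrated with a classical construction (a sketch, not the paper's argument): composing the two-ReLU "hat" function with itself k times yields a sawtooth with 2^(k-1) peaks using only O(k) neurons, while a depth-2 ReLU net provably needs width exponential in k to match it (Telgarsky's benefits-of-depth example).

```python
import numpy as np

def hat(x):
    # The "hat" function on [0, 1], exactly representable by two ReLU
    # neurons: h(x) = 2*relu(x) - 4*relu(x - 0.5).
    return 2 * np.maximum(x, 0.0) - 4 * np.maximum(x - 0.5, 0.0)

xs = np.linspace(0.0, 1.0, 2049)   # dense grid on [0, 1]
y = xs.copy()
k = 4
for _ in range(k):
    y = hat(y)                     # each extra layer doubles the oscillations

# Count strict interior local maxima of the k-fold composition.
peaks = int(np.sum((y[1:-1] > y[:-2]) & (y[1:-1] > y[2:])))
print(peaks)                       # 2**(k-1) = 8
```

Each composition is one more hidden layer of constant width, so depth buys oscillation (and thus feature complexity) exponentially cheaply compared with width, which is the flavor of trade-off the paper quantifies.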
Realization of spatial sparseness by deep ReLU nets with massive data
Chui, Charles K., Lin, Shao-Bo, Zhang, Bo, Zhou, Ding-Xuan
The great success of deep learning poses urgent challenges for understanding its working mechanism and rationality. Depth, structure, and the massive size of the data are recognized to be three key ingredients of deep learning. In this paper, we aim at a rigorous verification of the importance of massive data in embodying the out-performance of deep learning. To approximate and learn spatially sparse and smooth functions, we establish a novel sampling theorem in learning theory that shows the necessity of massive data. We then prove that implementing classical empirical risk minimization on certain deep nets realizes the optimal learning rates derived in the sampling theorem. This perhaps explains why deep learning performs so well in the era of big data.

With the rapid development of data mining and knowledge discovery, data of massive size are collected in various disciplines [50], including medical diagnosis, financial market analysis, computer vision, natural language processing, time series forecasting, and search engines. Such massive data bring additional opportunities to discover subtle data features that cannot be reflected by small data sets, while posing a crucial challenge for machine learning: to develop learning schemes that realize the benefits of massive data. Although numerous learning schemes such as distributed learning [26], localized learning [32], and sub-sampling [14] have been proposed to handle massive data, all of these schemes focus on tractability rather than on the benefit of massiveness. It therefore remains open to explore the benefits brought by massive data and to develop feasible learning strategies for realizing them.
Deep learning [18], characterized by training deep neural networks (deep nets for short) to extract data features using rich computational resources such as modern graphics processing units (GPUs) and custom processors, has achieved remarkable success in computer vision [23], speech recognition [24], and game theory [40], practically demonstrating its power in tackling massive data.

C.K. Chui is also associated with the Department of Statistics, Stanford University, CA 94305, USA. Shao-Bo Lin is with the Center of Intelligent Decision-making and Machine Learning, School of Management, Xi'an Jiaotong University, Xi'an, China.